-
Notifications
You must be signed in to change notification settings - Fork 27
Refresh v3 implementation #1115
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: 16/edge
Are you sure you want to change the base?
Conversation
Signed-off-by: Marcelo Henrique Neppel <[email protected]>
Codecov Report❌ Patch coverage is ❌ Your project check has failed because the head coverage (65.98%) is below the target coverage (70.00%). You can increase the head coverage or adjust the target coverage. Additional details and impacted files@@ Coverage Diff @@
## 16/edge #1115 +/- ##
===========================================
- Coverage 67.44% 65.98% -1.47%
===========================================
Files 18 18
Lines 3947 3904 -43
Branches 564 568 +4
===========================================
- Hits 2662 2576 -86
- Misses 1112 1143 +31
- Partials 173 185 +12 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Signed-off-by: Marcelo Henrique Neppel <[email protected]>
Signed-off-by: Marcelo Henrique Neppel <[email protected]>
Signed-off-by: Marcelo Henrique Neppel <[email protected]>
… tag Signed-off-by: Marcelo Henrique Neppel <[email protected]>
Signed-off-by: Marcelo Henrique Neppel <[email protected]>
Signed-off-by: Marcelo Henrique Neppel <[email protected]>
Signed-off-by: Marcelo Henrique Neppel <[email protected]>
Signed-off-by: Marcelo Henrique Neppel <[email protected]>
Signed-off-by: Marcelo Henrique Neppel <[email protected]>
Signed-off-by: Marcelo Henrique Neppel <[email protected]>
…entation Signed-off-by: Marcelo Henrique Neppel <[email protected]>
…check for primary on unit zero Signed-off-by: Marcelo Henrique Neppel <[email protected]>
Signed-off-by: Marcelo Henrique Neppel <[email protected]>
Signed-off-by: Marcelo Henrique Neppel <[email protected]>
Signed-off-by: Marcelo Henrique Neppel <[email protected]>
Signed-off-by: Marcelo Henrique Neppel <[email protected]>
Signed-off-by: Marcelo Henrique Neppel <[email protected]>
Signed-off-by: Marcelo Henrique Neppel <[email protected]>
Signed-off-by: Marcelo Henrique Neppel <[email protected]>
Signed-off-by: Marcelo Henrique Neppel <[email protected]>
Signed-off-by: Marcelo Henrique Neppel <[email protected]>
Signed-off-by: Marcelo Henrique Neppel <[email protected]>
Signed-off-by: Marcelo Henrique Neppel <[email protected]>
…efresh is not ready (#1156) * Reorder call to refresh logic and don't immediatelly exit hook when refresh is not ready Signed-off-by: Marcelo Henrique Neppel <[email protected]> * Log more info when creating a backup and send SIGHUP when the pgBackRest is already active Signed-off-by: Marcelo Henrique Neppel <[email protected]> * Change log level of messages to debug Signed-off-by: Marcelo Henrique Neppel <[email protected]> * Update unit test Signed-off-by: Marcelo Henrique Neppel <[email protected]> * Fix integration test job collection Signed-off-by: Marcelo Henrique Neppel <[email protected]> --------- Signed-off-by: Marcelo Henrique Neppel <[email protected]>
| - '**.md' | ||
| - .github/renovate.json5 | ||
| - 'docs/**' | ||
| - 'terraform/**' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added to match the rule we have in the VM charm.
| [pathlib.Path.home() / "go/bin/spread", "-list", "github-ci"], | ||
| capture_output=True, | ||
| check=True, | ||
| text=True, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Update to list tests now that we're using charmcraft v4.
| "--log-level-file=debug", | ||
| "--log-subprocess", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
More logging added (this helped to identify why the GCP PITR test was failing after refresh V3 was added in this PR: a SIGTERM appeared in the logs).
| logger.debug("Sending SIGHUP to pgBackRest TLS server to reload configuration") | ||
| self.container.pebble.send_signal( | ||
| signal.SIGHUP, services=[self.charm.pgbackrest_server_service] | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Always restarting the service was causing a SIGTERM to be sent to the pgBackRest TLS server in the middle of the process where the replica gets the latest data from the primary to create a backup. This started to happen because of some more deferred events happening. A SIGHUP is enough to update the TLS server when the certificates change.
| # ops_test.model.deploy( | ||
| # DATABASE_APP_NAME, | ||
| # num_units=3, | ||
| # channel="16/edge", | ||
| # trust=True, | ||
| # config={"profile": "testing"}, | ||
| # base=CHARM_BASE_NOBLE, | ||
| # ), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Commented to be restored after the first release with refresh v3 is available.
| # ops_test.model.deploy( | ||
| # DATABASE_APP_NAME, | ||
| # num_units=3, | ||
| # channel="16/stable", | ||
| # trust=True, | ||
| # config={"profile": "testing"}, | ||
| # base=CHARM_BASE_NOBLE, | ||
| # ), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Commented to be restored after the first release with refresh v3 is available.
| assert standbys == len(ops_test.model.applications[app_name].units) - 1 | ||
|
|
||
|
|
||
| async def switchover_to_unit_zero(ops_test: OpsTest) -> None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Used to reduce the number of switchovers to the minimum (as we expect since the previous refresh/upgrade implementation).
| UNTRUST_ERROR_MESSAGE = f"Insufficient permissions, try: `juju trust {APP_NAME} --scope=cluster`" | ||
| UNTRUST_ERROR_MESSAGE = ( | ||
| f"Run `juju trust {APP_NAME} --scope=cluster`. Needed for in-place refreshes" | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The message changes a bit with refresh v3.
Issue
The charm is not yet supporting refresh through refresh v3.
Solution
Added refresh v3 implementation based on the developer docs and also based on the implementations from PostgreSQL VM and Kafka K8s.
Use
set_unit_statusto set the unit status and properly set statuses during refresh.The old refresh/upgrade logic/files were removed.
Correctly reload pgBackRest TLS server configuration with SIGHUP to avoid interrupting the data replication between the primary and the replica (more details in the comments on this PR).
After this PR is approved, 2a806f7 must be reverted, then the PR should me merged, to properly generate the tags in the repo (this is the same as stated at canonical/postgresql-operator#866 (comment)).
Checklist